Reinforcement Learning for Multi-purpose Schedules
Authors
Abstract
In this paper, we present a learning technique for determining schedules for general devices that balance two objectives: user convenience and energy savings. The proposed learning algorithm is based on Fitted-Q Iteration (FQI) and analyzes the usage and the users of a particular device to decide on an appropriate profile of start-up and shutdown times for that equipment. The algorithm is experimentally evaluated on real-life data, showing that close-to-optimal control policies can be learned in only a few iterations. Our results show that the algorithm can propose intelligent schedules that reflect how much emphasis the user places on each objective.
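As a rough illustration of the idea, the following is a minimal sketch of Fitted-Q Iteration on a toy on/off scheduling problem. It is not the authors' implementation: the state (hour of day), the reward shape, the `weight` parameter that trades convenience against energy, the synthetic usage hours, and the lookup-table averaging regressor are all assumptions made for this example.

```python
import random
from collections import defaultdict

HOURS = 24
ACTIONS = (0, 1)  # 0 = device off, 1 = device on


def reward(hour, action, usage_hours, weight):
    """Hypothetical reward: weight * convenience - (1 - weight) * energy."""
    needed = hour in usage_hours
    convenience = (1.0 if action == 1 else -1.0) if needed else 0.0
    energy = 1.0 if action == 1 else 0.0
    return weight * convenience - (1.0 - weight) * energy


def collect_batch(usage_hours, weight, n=2000, seed=0):
    """Sample random (hour, action, reward, next_hour) transitions."""
    rng = random.Random(seed)
    batch = []
    for _ in range(n):
        hour, action = rng.randrange(HOURS), rng.choice(ACTIONS)
        r = reward(hour, action, usage_hours, weight)
        batch.append((hour, action, r, (hour + 1) % HOURS))
    return batch


def fit_table(inputs, targets):
    """Stand-in regressor: mean target per (hour, action) pair."""
    sums, counts = defaultdict(float), defaultdict(int)
    for key, y in zip(inputs, targets):
        sums[key] += y
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def fitted_q_iteration(batch, gamma=0.9, iterations=30):
    """Repeatedly regress Q onto r + gamma * max_a' Q(s', a')."""
    q = {}  # unseen (hour, action) pairs are treated as Q = 0
    inputs = [(h, a) for h, a, _, _ in batch]
    for _ in range(iterations):
        targets = [
            r + gamma * max(q.get((h2, a2), 0.0) for a2 in ACTIONS)
            for _, _, r, h2 in batch
        ]
        q = fit_table(inputs, targets)
    return q


def schedule(q):
    """Greedy on/off decision for each hour of the day (ties go to off)."""
    return [max(ACTIONS, key=lambda a: q.get((h, a), 0.0)) for h in range(HOURS)]


if __name__ == "__main__":
    office_hours = set(range(8, 18))  # synthetic usage pattern
    for w in (0.7, 0.2):
        q = fitted_q_iteration(collect_batch(office_hours, w))
        print(w, schedule(q))
```

In this toy setting, a convenience-heavy weight such as 0.7 yields a schedule that keeps the device on during the assumed usage hours, while an energy-heavy weight such as 0.2 keeps it off throughout. A practical setup would likely replace the lookup table with a function approximator and use logged usage data instead of random samples.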
Similar resources
Low-Area/Low-Power CMOS Op-Amps Design Based on Total Optimality Index Using Reinforcement Learning Approach
This paper presents the application of reinforcement learning in automatic analog IC design. In this work, the Multi-Objective approach by Learning Automata is evaluated for accommodating required functionalities and performance specifications considering optimal minimizing of MOSFETs area and power consumption for two famous CMOS op-amps. The results show the ability of the proposed method to ...
Stochastic reinforcement benefits skill acquisition.
Learning complex skills is driven by reinforcement, which facilitates both online within-session gains and retention of the acquired skills. Yet, in ecologically relevant situations, skills are often acquired when mapping between actions and rewarding outcomes is unknown to the learning agent, resulting in reinforcement schedules of a stochastic nature. Here we trained subjects on a visuomotor ...
Supporting Transparent Thread Assignment in Heterogeneous Multicore Processors Using Reinforcement Learning
Heterogeneity in multicore processor systems creates challenges in effectively mapping processes to diverse cores. While most approaches require programmer partitioning between core types or permutation of thread schedules to find the optimal mapping, we introduce a new machine learning approach to automated thread assignment. We train a reinforcement learning agent to assign threads to the bes...
An Evaluation of the Effects of Fixed-Time Schedules on Response Maintenance
Response-independent schedules of reinforcement (e.g., fixed-time schedules) have typically been shown to decrease the rate of responding. However, researchers have suggested that responses may maintain under response-independent schedules, although it is currently unclear as to what mechanisms are responsible for this maintenance. The purposes of the current study were to (a) replicate previou...
Operant conditioning
Operant behavior is behavior "controlled" by its consequences. In practice, operant conditioning is the study of reversible behavior maintained by reinforcement schedules. We review empirical studies and theoretical approaches to two large classes of operant behavior: interval timing and choice. We discuss cognitive versus behavioral approaches to timing, the "gap" experiment and its implicatio...